Methods for Batch Processing of Data Mining Queries
Abstract
Data mining is a useful decision support technique that can be used to find trends and regularities in warehouses of corporate data. A serious obstacle to its practical application is the long processing time required by data mining algorithms. Current systems take minutes or hours to answer a single request, while requests are typically delivered to the systems in batches. In this paper we present the problem of batch processing of data mining requests. We introduce methods that analyze similarities between separate requests to reduce the processing cost. We also perform a comparative performance analysis of the proposed methods.
Similar Resources
Integrated Candidate Generation in Processing Batches of Frequent Itemset Queries using Apriori
Frequent itemset mining can be regarded as advanced database querying where a user specifies constraints on the source dataset and patterns to be discovered. Since such frequent itemset queries can be submitted to the data mining system in batches, a natural question arises whether a batch of queries can be processed more efficiently than by executing each query individually. So far, two method...
Design and Test of the Real-time Text mining dashboard for Twitter
One of today's major research trends in the field of information systems is the discovery of implicit knowledge hidden in datasets that are currently being produced at high speed, in large volumes, and in a wide variety of formats. Data with such features is called big data. Extracting, processing, and visualizing huge amounts of data has become one of the concerns of data science scholar...
OMEGA: An Order-Preserving SubMatrix Mining, Indexing and Search Tool
Order-Preserving SubMatrix (OPSM) has been accepted as a significant tool for modelling biologically meaningful subspace clusters and for discovering the general tendency of gene expressions across a subset of conditions. Existing OPSM processing tools focus on providing one or more batch mining techniques, are time-consuming, and do not support OPSM queries. To address the problems, the paper...
Optimizing a Sequence of Frequent Pattern Queries
Discovery of frequent patterns is a very important data mining problem with numerous applications. Frequent pattern mining is often regarded as advanced querying where a user specifies the source dataset and pattern constraints using a given constraint model. A significant amount of research on efficient processing of frequent pattern queries has been done in recent years, focusing mainly on co...
Analysis of Pre-processing and Post-processing Methods and Using Data Mining to Diagnose Heart Diseases
Today, a great deal of data is generated in the medical field. Acquiring useful knowledge from this raw data requires data processing and detection of meaningful patterns and this objective can be achieved through data mining. Using data mining to diagnose and prognose heart diseases has become one of the areas of interest for researchers in recent years. In this study, the literature on the ap...
Publication date: 2002